Techniques for Speeding up Range-Max Queries in OLAP Data Cubes
Authors
Abstract
A range-max query obtains the maximum over all selected cells of a data cube, where the selection is specified by providing ranges of values for numeric dimensions. Our general approach to speeding up range-max queries is to precompute and store certain key information of the data cube. In [HAMS97], we gave a tree algorithm based on precomputed max over balanced hierarchical tree structures; a branch-and-bound-like [Mit70] procedure was used to prune unnecessary search. In this paper, we propose three orthogonal techniques with the objective of improving the average response time of range-max queries. First, rather than keeping only the index of the largest value at each internal node of the tree, we keep the indices of the t largest values with each internal node and use them to decrease the probability of scanning lower-level nodes. Second, we further partition each sibling set of internal nodes into smaller groups and sort the precomputed indices within each group according to their indexed values. This speeds up the scanning of internal nodes that are at the same level and covered by the query region, without adding storage overhead. Third, we augment the tree with a precomputed reference array for each level of the tree (except for the leaf level). Elements of a reference array contain references to the next larger value, which are used to speed up the search. We compare our three algorithms with the previous algorithm both analytically and empirically. Based on these comparisons, we then propose and implement a hybrid algorithm, combining the advantages of these orthogonal techniques, that improves the empirically measured range-max query time by as much as 100%. We also give algorithms for incrementally updating the precomputed structures.
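To make the baseline idea concrete, the following is a minimal one-dimensional sketch of a precomputed max tree with branch-and-bound-style pruning. It is an illustration under stated assumptions, not the authors' implementation: the class name `RangeMaxTree`, the `fanout` parameter, and the array layout are all hypothetical, and the three techniques proposed in the abstract (t largest values, sorted sibling groups, reference arrays) are not included.

```python
from typing import List, Optional


class RangeMaxTree:
    """Illustrative 1-D max tree; names and layout are assumptions."""

    def __init__(self, data: List[float], fanout: int = 4) -> None:
        if not data:
            raise ValueError("data must be non-empty")
        if fanout < 2:
            raise ValueError("fanout must be at least 2")
        self.data = list(data)
        self.fanout = fanout
        # levels[0] is the leaf level: each cell is its own maximum.
        # levels[k][i] stores the index of the largest value among the
        # (up to) fanout**k cells covered by node i of level k.
        self.levels: List[List[int]] = [list(range(len(self.data)))]
        while len(self.levels[-1]) > 1:
            below = self.levels[-1]
            self.levels.append([
                max(below[i:i + fanout], key=lambda j: self.data[j])
                for i in range(0, len(below), fanout)
            ])

    def query(self, lo: int, hi: int) -> int:
        """Return the index of a maximum value in data[lo..hi], inclusive."""
        if not 0 <= lo <= hi < len(self.data):
            raise IndexError("invalid query range")
        best = self._search(len(self.levels) - 1, 0, lo, hi, None)
        assert best is not None  # the root always overlaps a valid range
        return best

    def _search(self, level: int, pos: int, lo: int, hi: int,
                best: Optional[int]) -> Optional[int]:
        span = self.fanout ** level
        start = pos * span
        end = min(len(self.data), start + span) - 1
        if end < lo or start > hi:
            return best  # node lies outside the query region
        m = self.levels[level][pos]
        # Bound: nothing below this node can beat the current best.
        if best is not None and self.data[m] <= self.data[best]:
            return best
        # The precomputed max lies inside the query region: no descent needed.
        if lo <= m <= hi:
            return m
        if level == 0:
            return best
        # Otherwise branch into the children that exist at the level below.
        below = self.levels[level - 1]
        first = pos * self.fanout
        for child in range(first, min(first + self.fanout, len(below))):
            best = self._search(level - 1, child, lo, hi, best)
        return best


data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
tree = RangeMaxTree(data, fanout=3)
print(tree.query(0, 3))  # index of the maximum of data[0..3]
print(tree.query(6, 9))  # index of the maximum of data[6..9]
```

The descent stops as soon as a node's stored maximum falls inside the query range, which is why storing more candidates per node (the first technique in the abstract) can plausibly reduce how often lower levels are scanned.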
Similar resources
Ranking Aggregates
Ranking-aware queries have been gaining much attention recently in many applications such as search engines and data streams. They are, however, not only restricted to such applications but are also very useful in OLAP applications. In this paper, we introduce aggregation ranking queries in OLAP data cubes motivated by an online advertisement tracking data warehouse application. These queries a...
Progressive Ranking of Range Aggregates
Ranking-aware queries have been gaining much attention recently in many applications such as search engines and data streams. They are, however, not only restricted to such applications but are also very useful in OLAP applications. In this paper, we introduce aggregation ranking queries in OLAP data cubes motivated by an online advertisement tracking data warehouse application. These queries a...
Range Sum Queries in Dynamic OLAP Data Cubes
The data cube is frequently adopted to implement On-Line Analytical Processing (OLAP) and provides aggregate information to support the analysis of contents of databases and data warehouses. Range-sum queries require accessing large data cubes and adding the contents of massive cells immediately. Techniques have thus been proposed to accelerate range-sum queries by applying pre-aggregated speci...
Answering Approximate Range Aggregate Queries on OLAP Data Cubes with Probabilistic Guarantees
Approximate range aggregate queries are one of the most frequent and useful kinds of queries for Decision Support Systems (DSS). Traditionally, sampling-based techniques have been proposed to tackle this problem. However, its effectiveness will degrade when the underlying data distribution is skewed. Another approach based on the outlier management can limit the effect of data skew but fails to...
The Iterative Data Cube
Data cubes provide aggregate information to support the analysis of the contents of data warehouses and databases. An important tool to analyze data in data cubes is the range query. For range queries that summarize large regions of massive data cubes, computing the query result on-the-fly can result in non-interactive response times (e.g. in the order of minutes). To speed up range queries, value...
Publication date: 1997